-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
planner, executor: implement the null-aware antiSemiJoin and null-aware antiLeftOuterSemiJoin (hash join with inner build) #37512
Conversation
[REVIEW NOTIFICATION] This pull request has been approved by:
To complete the pull request process, please ask the reviewers in the list to review by filling The full list of commands accepted by this bot can be found here. Reviewer can indicate their review by submitting an approval review. |
Code Coverage Details: https://codecov.io/github/pingcap/tidb/commit/9974aefb74ef8f5cc8106b960bd8d967a415eefd |
Here is the issue: #37525 |
/run-mysql-test tidb-test=pr/1953 |
4 similar comments
/run-mysql-test tidb-test=pr/1953 |
/run-mysql-test tidb-test=pr/1953 |
/run-mysql-test tidb-test=pr/1953 |
/run-mysql-test tidb-test=pr/1953 |
@@ -142,6 +142,7 @@ type LogicalJoin struct { | |||
preferJoinOrder bool | |||
|
|||
EqualConditions []*expression.ScalarFunction | |||
NAEQConditions []*expression.ScalarFunction |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will it be better if we use a bool value to tell whether is null aware or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Step2: when step1 is finished, then we can determine whether we need to extract NA-EQ from OtherCondition to NAEQConditions.
// when there are still no EqualConditions, let's try to be a NAAJ.
In where the comment is, we only fill the NAEQConditions when there are no common EqualConditions.
@@ -575,12 +646,84 @@ func (j *leftOuterSemiJoiner) Clone() joiner { | |||
return &leftOuterSemiJoiner{baseJoiner: j.baseJoiner.Clone()} | |||
} | |||
|
|||
type nullAwareAntiLeftOuterSemiJoiner struct { | |||
baseJoiner |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we put the NAAJType as a member variable of the struct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On only the OnMisMatch(), we will fill it with 1
. quite different like before, handling null in onMatch is a kind of acceleration of execution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like null-aware join is so different from the normal join, this pr introduce too many if isNA
related code, even the top interface tryMatchInners/onMatch
are polluted. I wonder that comparing to the current implementation, if it can be more clear by adding a new NAHashJoinExec instead? @XuHuaiyu what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, i'm also thinking whether we could split to two join exec that extract the comment part as an interface and baseHashJoinExec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems reasonable, the code size will expand dramatically in this pr (some copy work is unavoidable even if extracting an interface for baseHashJoinExec), we can put the clear work in the next pr?
/run-mysql-test tidb-test=pr/1953 |
4 similar comments
/run-mysql-test tidb-test=pr/1953 |
/run-mysql-test tidb-test=pr/1953 |
/run-mysql-test tidb-test=pr/1953 |
/run-mysql-test tidb-test=pr/1953 |
// IsNullEQ is used for cases like Except statement where null key should be matched with null key. | ||
// <1,null> is exactly matched with <1,null>, where the null value should not be filtered and | ||
// the null is exactly matched with null only. (while in NAAJ null value should also be matched | ||
// with other non-null item as well) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am confused, null value matched with non-null values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, null in (y set), for lhs, it doesn't care what the y actually is.
additional, when match <1, null> probe row to hash table, what are you ganna fetch from hash-table? the pattern should be <1, whatever>
delete from naaj_B; | ||
insert into naaj_B values(2, 2, 2); | ||
select (a, b) not in (select a, b from naaj_B) from naaj_A; | ||
select * from naaj_A where (a, b) not in (select a, b from naaj_B); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have cases where naaj_A has (null, null, null) tuple?
1 NULL 1 | ||
explain format = 'brief' select (a, b) not in (select a, b from naaj_B where naaj_A.c = naaj_B.c) from naaj_A; | ||
id estRows task access object operator info | ||
HashJoin 10000.00 root anti left outer semi join, equal:[eq(test.naaj_a.c, test.naaj_b.c)], other cond:eq(test.naaj_a.a, test.naaj_b.a), eq(test.naaj_a.b, test.naaj_b.b) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the other condition eq(test.naaj_a.a, test.naaj_b.a), eq(test.naaj_a.b, test.naaj_b.b)
a normal eq condition or need to be null aware?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, it does.
here is the old path, since it already has the correlated EQ condition "naaj_A.c = naaj_B.c", which could be used as a hash join key. So the na-eq condition should be put in the other condition as it was before. (mixture EQ cond and NA-EQ cond in hash join key is little tricky)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be future improvements. We can open another issue to track this optimization (EQ and NAEQ as the hash keys).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make sense
cmd/explaintest/r/subquery.result
Outdated
@@ -55,7 +55,7 @@ insert into exam values(1, 'math', 100); | |||
set names utf8 collate utf8_general_ci; | |||
explain format = 'brief' select * from stu where stu.name not in (select 'guo' from exam where exam.stu_id = stu.id); | |||
id estRows task access object operator info | |||
Apply 10000.00 root CARTESIAN anti semi join, other cond:eq(test.stu.name, Column#8) | |||
Apply 10000.00 root Null-aware anti semi join, equal:[eq(test.stu.name, Column#8)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious why this sql need using Apply
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
interesting, collation block this, see issue #37032
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in apply mode, even there is a na-eq here, it will be appended as other conds in running
@@ -1972,7 +1993,10 @@ func (b *executorBuilder) buildApply(v *plannercore.PhysicalApply) Executor { | |||
if b.err != nil { | |||
return nil | |||
} | |||
otherConditions := append(expression.ScalarFuncs2Exprs(v.EqualConditions), v.OtherConditions...) | |||
// test is in the explain/naaj.test#part5. | |||
// although we prepared the NAEqualConditions, but for Apply mode, we still need move it to other conditions like eq condition did here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So Apply
does not need care about null/empty
things?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually, it does, it feels the null & empty of other conditions(na-eq condition is marked) in joiner logic. doesn't like hash join, you can do this at the matching phase before joiner detail works.
@@ -22,7 +22,7 @@ | |||
{ | |||
"SQL": "select * from t1 where (t1.a, t1.b) not in (select a, b from t2)", | |||
"Plan": [ | |||
"HashJoin 3.20 root CARTESIAN anti semi join, other cond:eq(test.t1.a, test.t2.a), eq(test.t1.b, test.t2.b)", | |||
"HashJoin 3.20 root Null-aware anti semi join, equal:[eq(test.t1.b, test.t2.b) eq(test.t1.a, test.t2.a)]", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why condition order is changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, some misorder before (>,<), but it doesn't affect the correctness by now.
planner/core/stats.go
Outdated
leftKeyNDV := getColsNDV(h.leftJoinKeys, h.leftSchema, h.leftProfile) | ||
rightKeyNDV := getColsNDV(h.rightJoinKeys, h.rightSchema, h.rightProfile) | ||
var leftKeyNDV, rightKeyNDV float64 | ||
if len(h.leftJoinKeys) > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If both leftJoinKyes
and leftNAJoinKeys
are non-empty, the ndv will be based only on the leftJoinKyes
, is it a by-design behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
len(h.leftJoinKeys) > 0 || len(h.rightJoinKeys) > 0, I want this actually, I take it for granted that left keys are always same size of right keys. (join key and na-join key is mutually exclusive)
fixed
@@ -1942,6 +1950,8 @@ func (p *LogicalJoin) tryToGetMppHashJoin(prop *property.PhysicalProperty, useBC | |||
return nil | |||
} | |||
lkeys, rkeys, _, _ := p.GetJoinKeys() | |||
lNAkeys, rNAKeys := p.GetNAJoinKeys() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since mpp does not support na-keys, looks like we should return nil once na-keys is not empty?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, fixed
@@ -1809,6 +1809,11 @@ func (p *LogicalJoin) shouldUseMPPBCJ() bool { | |||
return checkChildFitBC(p.children[0]) || checkChildFitBC(p.children[1]) | |||
} | |||
|
|||
// canPushToCop checks if it can be pushed to some stores. | |||
func (p *LogicalJoin) canPushToCop(storeTp kv.StoreType) bool { | |||
return len(p.NAEQConditions) == 0 && p.baseLogicalPlan.canPushToCop(storeTp) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There will be some compatible issues here:
Before this pr, for sql like select * from t where nullable_col not in (select xx from t1)
, if the subquery meet the broadcast threshold, it can be pushed to TiFlash, but after this pr, there is no way to pushdown such join to TiFlash? I think before TiFlash's support of naaj is ready, it's better to add a switch to let user decide to use naaj or cross join
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it will be controlled by the switch before tiflash supports the exec logic. Just forbid pushing down for now is okay?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, after the switch, the len(p.NAEQConditions) == 0 should exactly be 0 here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
Signed-off-by: AilinKid <314806019@qq.com>
needCheckBuildRowPos = append(needCheckBuildRowPos, c.hCtx.naKeyColIdx[i]) | ||
needCheckProbeRowPos = append(needCheckProbeRowPos, probeHCtx.naKeyColIdx[i]) | ||
} | ||
// check the idxs-th value of the join columns. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does idxs
here mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
join column(a,b,c) (e,f,g)
find the non-null bit, for example, (b,c) - (e,f)
use the b,c's, and e,f's column index to fetch the value from the chunk to check the same.
the idxs (means b,c's column index)/(e,f's column index)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like needCheckProbeRowPos
and needCheckBuildRowPos
can be generated outside the loop since it only related to probeKeyNullBits
?
@@ -575,12 +646,84 @@ func (j *leftOuterSemiJoiner) Clone() joiner { | |||
return &leftOuterSemiJoiner{baseJoiner: j.baseJoiner.Clone()} | |||
} | |||
|
|||
type nullAwareAntiLeftOuterSemiJoiner struct { | |||
baseJoiner |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, i'm also thinking whether we could split to two join exec that extract the comment part as an interface and baseHashJoinExec.
@@ -1809,6 +1809,11 @@ func (p *LogicalJoin) shouldUseMPPBCJ() bool { | |||
return checkChildFitBC(p.children[0]) || checkChildFitBC(p.children[1]) | |||
} | |||
|
|||
// canPushToCop checks if it can be pushed to some stores. | |||
func (p *LogicalJoin) canPushToCop(storeTp kv.StoreType) bool { | |||
return len(p.NAEQConditions) == 0 && p.baseLogicalPlan.canPushToCop(storeTp) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it will be controlled by the switch before tiflash supports the exec logic. Just forbid pushing down for now is okay?
Co-authored-by: Yiding Cui <winoros@gmail.com>
needCheckBuildRowPos = append(needCheckBuildRowPos, c.hCtx.naKeyColIdx[i]) | ||
needCheckProbeRowPos = append(needCheckProbeRowPos, probeHCtx.naKeyColIdx[i]) | ||
} | ||
// check the idxs-th value of the join columns. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like needCheckProbeRowPos
and needCheckBuildRowPos
can be generated outside the loop since it only related to probeKeyNullBits
?
hashVals []hash.Hash64 | ||
hasNull []bool | ||
naHasNull []bool | ||
naColNullBitMap []*bitmap.ConcurrentBitmap |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why need a concurrentBitmap here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, it's unnecessary here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's unnecessary, I think we should not use concurrentBitMap here, please use a normal bitmap instead.
hCtx.initHash(probeSideChk.NumRows()) | ||
numRows := probeSideChk.NumRows() | ||
hCtx.initHash(numRows) | ||
// By now, path 1 and 2 won't be conducted at the same time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then why not use if isNAAJ
to make sure path 1 and path 2 could not be conducted at the same time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cause we guarantee the EQcondition and NAEQcondition couldn't be filled at same time before.
Yes it could, for the same code style of |
Signed-off-by: AilinKid <314806019@qq.com>
Signed-off-by: AilinKid <314806019@qq.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
/merge |
This pull request has been accepted and is ready to merge. Commit hash: 3f3f9b7
|
/merge |
/merge |
/run-cherry-picker |
Signed-off-by: ti-srebot <ti-srebot@pingcap.com>
cherry pick to release-6.2 in PR #37980 |
TiDB MergeCI notify🔴 Bad News! New failing [1] after this pr merged.
|
Signed-off-by: AilinKid 314806019@qq.com
What problem does this PR solve?
Issue Number: close #37525
Problem Summary:
What is changed and how it works?
Check List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.